Validation of the Emergency Department-Paediatric Early Warning Score (ED-PEWS) for use in low- and middle-income countries: A multicentre observational study

Early recognition of children at risk of serious illness is essential in preventing morbidity and mortality, particularly in low- and middle-income countries (LMICs). This study aimed to validate the Emergency Department-Paediatric Early Warning Score (ED-PEWS) for use in acute care settings in LMICs. This observational study is based on previously collected clinical data from consecutive children attending four diverse settings in LMICs. Inclusion criteria and study periods (2010–2021) varied. We simulated the ED-PEWS, consisting of patient age, consciousness, work of breathing, respiratory rate, oxygen saturation, heart rate, and capillary refill time, based on the first available parameters. Discrimination was assessed by the area under the curve (AUC), sensitivity and specificity (previously defined cut-offs < 6 and ≥ 15). For each setting, the outcome measure was a composite marker of high urgency. In total, 41,917 visits from Gambia rural, 501 visits from Gambia urban, 2,608 visits from Suriname, and 1,682 visits from Tanzania were included. The proportion of high urgency was variable (range 4.6% to 24.9%). Performance ranged from AUC 0.80 (95%CI 0.70–0.89) in Gambia urban to 0.62 (95%CI 0.55–0.67) in Tanzania. The low-urgency cut-off showed a high sensitivity in all settings, ranging from 0.83 (95%CI 0.81–0.84) to 1.00 (95%CI 0.97–1.00). The high-urgency cut-off showed a specificity ranging from 0.71 (95%CI 0.66–0.75) to 0.97 (95%CI 0.97–0.97). The ED-PEWS shows moderate to good performance for the recognition of high-urgency children in these LMIC settings. The score appears to have potential to improve the identification of high-urgency children in LMICs.


Introduction
In high- and low-income countries, millions of patients seek emergency care yearly, approximately a quarter of whom are children. In these acute care settings, recognizing the child with serious illness or at risk of deterioration is a major clinical challenge [1]. In hospitals in low-income countries, about 50% of deaths of children occur in the first 24 hours after admission [2][3][4]. Therefore, effective and adequate prioritisation of children based on their urgency is crucial to ensure that the most severely ill children are identified early and accurately, and that resources are efficiently allocated. Better prioritisation of high-urgency children starts with accurate triage upon first presentation.
Several triage systems have been developed specifically for use in low- and middle-income countries (LMICs), such as the Emergency Triage Assessment and Treatment (ETAT) [5]. In contrast to conventional triage systems, Paediatric Early Warning Scores (PEWS) are simple scoring systems based solely on physiological parameters, designed to assess the severity of illness and the risk of clinical deterioration. PEWS are quick, easily applicable at the bedside, and consist of objective parameters, which is an advantage in settings where patients speak different local languages.
Research on the use of PEWS in LMICs is scarce and there is no consensus on which PEWS has the best performance (S1 File). A promising tool is the recently developed Emergency Department-Paediatric Early Warning Score (ED-PEWS) [6]. This is the first score fully based on statistical modelling and specifically developed for acute care settings. In the original study, the ED-PEWS showed a good performance for identifying both high and low urgency patients in a large European cohort including more than 100,000 paediatric visits. However, performance of the ED-PEWS in LMICs is unknown. Therefore, the aim of this study is to validate the ED-PEWS in different low- and middle-income settings.

Study design
This study is an observational study, based on four databases with previously collected clinical data from The Gambia (rural and urban), Suriname and Tanzania. All databases have previously been created for research purposes [7][8][9][10][11][12]. The data were collected during routine clinical care or during clinical care as part of a research setting. The local medical ethical committees approved the data collection: Gambia Government/MRCG Joint Ethics Committee, Project ID: 24006 (Gambia rural), and 22470 (Gambia urban); Commissie Mensgebonden Wetenschappelijk Onderzoek Ministerie van Volksgezondheid Paramaribo, 1 February 2021; and Ifakara Health Institute (IHI/IRB/No: 12-2014). The requirement for informed consent was waived in The Gambia rural and Suriname. A declaration of exemption ("verklaring niet-WMO plichtig") from the Erasmus MC Ethics committee has been issued. We followed the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) statement for reporting [13].

Study sites and population
We recruited patient cohorts that consisted of consecutive children attending primary and emergency care settings in LMICs and for whom data had been collected on physiological parameters and markers of severity [14]. This resulted in data from four cohorts from heterogeneous acute care settings in three different countries (Table 1). In short, data were collected from 1) the MRC Keneba Fieldstation, a primary healthcare clinic in the rural region Kiang West in The Gambia (hereafter: Gambia rural); 2) MRC Fajara and Kanifing general hospital, two urban tertiary care clinics in The Gambia (Gambia urban); 3) Academic hospital Paramaribo, a tertiary care hospital in an urban area in Suriname (Suriname); and 4) three secondary care district hospitals and six primary health care centres in Tanzania (Tanzania). The inclusion criteria of these databases were defined by the original study for which the data were collected. The most important inclusion criteria were fever (Tanzania and Gambia urban) and age under 60 months (Tanzania). All settings were open access and no referral by another physician was necessary. All duplicates, children with missing data on age, and in Gambia rural children with mothers under fifteen years of age were excluded. Furthermore, we did not include routine visits unrelated to emergency care, e.g. immunisations and follow-up visits. All databases were checked for quality and implausible values. The quality of the data was assessed using frequency tables of the variables, and specifically by assessing outliers in each age category. Additionally, discrepancies in the data were discussed with the local research teams. Data checks and interpretation took place in close collaboration with the local research teams. For each of the settings we scheduled a minimum of two online meetings to discuss the database and findings.

Vital signs and ED-PEWS
We simulated the ED-PEWS using the available vital signs data. In all settings, vital signs were taken on presentation by trained personnel according to local practices (S2 File). Only the first set of clinical observations for each child was included. The ED-PEWS was retrospectively calculated using the following vital signs and physiological parameters: age, heart rate, respiratory rate, oxygen saturation, capillary refill time, consciousness, and work of breathing. A score was assigned to each vital sign and all scores were added up to reach the total score. This total score can range from 0 to 68 points (S3 File) [6]. Capillary refill time was not available in the Tanzanian data. If a variable was not available or missing, we scored the variable as normal.
Because all variables had less than 5% missing data, a complete records analysis was considered acceptable for the logistic regression analysis. Scoring unavailable items as normal also reflects how the score would be used in clinical practice, because an unavailable item would not contribute to the total score [15].
In addition to the vital signs, we collected data on existing triage methods and whether staff considered a child as ill appearing. In all settings except for Tanzania, patients were routinely triaged using a conventional triage system. In Gambia rural, triage was performed using the 3-level Emergency Triage Assessment and Treatment (ETAT) [5]. In Suriname, the 5-level Emergency Severity Index (ESI) was used, and in Gambia urban, the 5-level Integrated Management of Childhood Illness (IMCI) classifications [5,16]. Both the ESI and the ETAT make use of vital parameters or derivatives of vital parameters in their respective triage systems. Ill appearance was assessed by the attending healthcare workers based on their own judgement.
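As an illustration only, the summation and the handling of unavailable items described above can be sketched in Python. This is not the study's SPSS procedure. The item names, the function `total_score`, and the choice to represent "normal" as 0 points are assumptions made for the sketch; the actual points per item are given in S3 File and are not reproduced here.

```python
from typing import Mapping, Optional

# Items named in the text. The item labels are hypothetical identifiers;
# the point values per item (S3 File) must be looked up by the caller.
ITEMS = (
    "age",
    "heart_rate",
    "respiratory_rate",
    "oxygen_saturation",
    "capillary_refill_time",
    "consciousness",
    "work_of_breathing",
)
MAX_TOTAL = 68  # upper bound of the total score stated in the text


def total_score(item_points: Mapping[str, Optional[int]]) -> int:
    """Sum the points of all items into one total score.

    An item that is absent or None is scored as normal, which is
    ASSUMED here to mean 0 points, so it does not contribute.
    """
    total = 0
    for item in ITEMS:
        points = item_points.get(item)
        total += 0 if points is None else points
    if not 0 <= total <= MAX_TOTAL:
        raise ValueError(f"total {total} is outside 0..{MAX_TOTAL}")
    return total
```

For example, a record without capillary refill time (as in the Tanzanian data) would simply omit that key or set it to `None`, and the remaining items would determine the total.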

Outcome measures
The outcome measure representing high-urgency children was defined for each setting separately. We used the original outcome measure from the ED-PEWS derivation study as starting point (S4 File), and together with the local research teams, adjusted the different items to the local situation, taking into account the available data. In all settings, a marker of a higher level of care, for example intensive care unit (ICU) admission or referral to secondary care, was selected if available. Furthermore, admission, and in some settings intravenous (IV) medication, were part of the outcome measure. Additional available markers for high urgency in Suriname were oxygen administration, inhalation medication and immediate lifesaving interventions (Table 2) [17,18]. All variables had been collected prior to the start of the study, and thus there was no need for blind assessment of predictor and outcome variables.
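A composite outcome of this kind can be read as a logical "any of" over the markers available in a setting. The following Python sketch illustrates that reading; the function name `is_high_urgency`, the marker labels, and the treatment of unrecorded markers as absent are assumptions, and the actual per-setting definitions are those in Table 2.

```python
from typing import Mapping, Optional


def is_high_urgency(markers: Mapping[str, Optional[bool]]) -> bool:
    """Return True if at least one marker of high urgency is present.

    Markers that were not recorded (None) are treated as absent;
    this is an assumption of the sketch, not a rule from the study.
    """
    return any(value is True for value in markers.values())


# Hypothetical visit using marker labels loosely based on the text.
example_visit = {
    "icu_admission": False,
    "admission": False,
    "iv_medication": None,
    "oxygen_administration": True,
}
print(is_high_urgency(example_visit))
```

Because the set of available markers differs per setting, the same function would be called with a different set of keys for each cohort.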

Statistical analysis
The data were statistically analysed with SPSS version 25. In a descriptive analysis, the baseline characteristics of the presenting children were explored. Subsequently, univariable and multivariable logistic regression models were used to calculate the association between the individual predictors of the ED-PEWS and the outcome, adjusted for age, sex and other vital parameters. Temperature was not included in the analyses for Gambia urban and Tanzania because fever was an inclusion criterion for these databases. The analyses from Gambia rural were also adjusted for free access to transportation to the clinic, because this was an important confounder in this population in previous research [8,19]. Children with free access to transportation were defined as those living in Keneba, Manduar or Kantong Kunda, who received free weekly transport to the clinic during the study period. Discrimination of the ED-PEWS was quantified by the area under the curve (AUC). We were unable to determine measures of calibration (the agreement between observed outcomes and predictions), because our outcome measure differed from the original derivation study. To assess the ED-PEWS' diagnostic accuracy, we calculated sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio and negative likelihood ratio. In the original validation of the ED-PEWS in European data, the cut-off for low urgency was defined as a score < 6, and a score ≥ 15 was defined as the cut-off for high urgency [6]. These pre-determined cut-offs were used for the diagnostic accuracy calculations. Furthermore, we explored a range of potentially relevant cut-off values for high and low urgency. We defined the optimal cut-off as a sensitivity of > 0.9 for low urgency and a specificity of > 0.9 for high urgency. Subgroup analyses were performed to assess the performance of the ED-PEWS in children under five years, and in children with fever, defined as a temperature above 38.0 °C.
The results from the ED-PEWS were compared to Egdell's Paediatric Advanced Warning score (PAWS) (S5 File) and the existing triage system in each setting [20]. The Egdell-PAWS is one of the few PEWS specifically designed for use in acute care settings, although it was developed in a high-income country. The PAWS was retrospectively calculated based on the following variables: age-adjusted respiratory rate, age-adjusted heart rate, saturation, temperature, consciousness, capillary refill time and increased work of breathing. We had to adjust the levels of consciousness based on the variables available in the database.
The study size was based on a convenience sample of four databases from LMICs. Because all the data were collected prior to the study, we only assessed whether we had enough cases to draw conclusions with confidence. The most important consideration was the number of high-urgency children. For external validation, simulation studies have suggested a minimum of 100 events, which would mean high-urgency children, and 100 non-events or low-urgency children, to obtain reliable estimates of model performance [21,22].
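The discrimination and diagnostic accuracy measures named above can be illustrated with a short, dependency-free Python sketch. It is not the SPSS analysis used in the study. The function names are hypothetical, the AUC is computed as the probability that a randomly chosen high-urgency visit scores higher than a randomly chosen other visit (ties count half), and a visit is assumed to be "test positive" when its score is at or above the cut-off, so that the low-urgency cut-off (< 6) is evaluated at 6 and the high-urgency cut-off at 15.

```python
import math
from typing import Dict, Sequence


def auc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at(scores: Sequence[float], outcomes: Sequence[int],
                cutoff: float) -> Dict[str, float]:
    """Diagnostic accuracy with 'test positive' defined as score >= cutoff."""
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if s >= cutoff and y)
    fn = sum(1 for s, y in pairs if s < cutoff and y)
    fp = sum(1 for s, y in pairs if s >= cutoff and not y)
    tn = sum(1 for s, y in pairs if s < cutoff and not y)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        # Predictive values are undefined when no visit falls in the group.
        "ppv": tp / (tp + fp) if tp + fp else math.nan,
        "npv": tn / (tn + fn) if tn + fn else math.nan,
        # Likelihood ratios become infinite when the denominator is zero.
        "lr_pos": sens / (1 - spec) if spec < 1 else math.inf,
        "lr_neg": (1 - sens) / spec if spec > 0 else math.inf,
    }
```

With made-up scores `[2, 7, 16, 4, 9, 20]` and outcomes `[0, 0, 0, 1, 1, 1]`, calling `accuracy_at(scores, outcomes, 6)` gives the sensitivity associated with the low-urgency cut-off, and `accuracy_at(scores, outcomes, 15)` gives the specificity associated with the high-urgency cut-off.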
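The rule of thumb above amounts to a simple check per cohort. A minimal Python sketch, with a hypothetical function name and made-up counts rather than the study's data:

```python
def meets_minimum_events(n_events: int, n_non_events: int,
                         minimum: int = 100) -> bool:
    """Check the suggested minimum of events and non-events for
    external validation (default threshold of 100, as cited in the text)."""
    if n_events < 0 or n_non_events < 0:
        raise ValueError("counts must be non-negative")
    return n_events >= minimum and n_non_events >= minimum


# Made-up counts for illustration only.
print(meets_minimum_events(120, 2000))  # enough events and non-events
print(meets_minimum_events(40, 460))    # too few events
```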

Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation or writing of the report.

Baseline characteristics
Data were available from 54,192 visits to one of the four different study sites. A total of 7,537 visits (14%) were excluded, the majority because they were defined as routine visits not related to emergency care (n = 7,429). After application of the exclusion criteria, 46,652 visits remained: 41,917 visits from Gambia rural, 501 visits from Gambia urban, 2,608 visits from Suriname and 1,596 visits from Tanzania (Table 3 and S6 File). The median patient age ranged from 1.1 to 4.2 years and the majority of children were boys (range 51.6% to 56.4%). Following the initial visit, the percentage of children admitted to hospital, including the ICU, ranged from 3.4% to 14.3%. The proportion of children with the high urgency outcome, as described in Table 2, ranged from 4.6% (Gambia urban) to 24.8% (Suriname). In all centres, missing rates of physiological parameters were low (S7 and S8 Files).
All variables, except capillary refill time in Gambia rural and work of breathing in Tanzania, were significantly associated with the outcome measures in the multivariable analysis (S9 File). The strongest predictors were consciousness, with an adjusted OR ranging from 1.8 (95%CI 1.3-2.5) to 5.5 (95%CI 3.3-9.3), and work of breathing, with an adjusted OR ranging from 2.6 (95%CI 2.0-3.3) to 9.6 (95%CI 6.4-14.3).

Performance of the ED-PEWS
Performance of the ED-PEWS was highly variable in the different settings. The AUC was highest in Gambia urban (0.80 (95%CI 0.70-0.89)), followed by Suriname (0.78 (95%CI 0.76-  4). When looking at the different subgroups, performance was better in the children under five years compared with older children. There was no consistent trend in performance in the subgroup of children with fever. In all settings, discrimination of the ED-PEWS was better than the Egdell-PAWS and the existing local triage methods, except for the existing triage in Gambia urban (Table 4). The earlier defined low-urgency cut-off point of < 6 showed a high sensitivity in all settings, ranging from 0.83 (95%CI 0.81-0.84) to 1.00 (95%CI 0.97-1.00), and identified 2.7% to 33.0% of all visits as low urgent (Table 5). The earlier defined high-urgency cut-off point of ≥ 15 showed a specificity ranging from 0.71 (95%CI 0.66-0.75) to 0.97 (95%CI 0.97-0.97) and classified 3.4% to 30.9% as high urgent. The optimal cut-off points were only explored in Gambia rural, Gambia urban and Suriname, because the restricted age group in Tanzania made it impossible to define a representative cut-off (S10 File). The earlier defined cut-off points were optimal for Suriname and a reasonable choice for Gambia rural and Gambia urban. A better cut-off point in Gambia rural would be ≥ 10 for high urgency, classifying 10.1% as high urgent, with a specificity of 0.91 (95%CI 0.91-0.91), and in Gambia urban ≥ 20 for high urgency, classifying 8.4% as high urgent, with a specificity of 0.93 (95%CI 0.91-0.95) and < 10 for low urgency, classifying 28.5% as low urgent, with a sensitivity of 0.96 (95%CI 0.78-1.00).
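The exploration of alternative cut-off points can be illustrated with a small Python sketch. It is not the procedure used in the study. It assumes that a visit is "test positive" when its score is at or above the candidate cut-off, that the preferred low-urgency cut-off is the highest value that still keeps sensitivity above 0.9, and that the preferred high-urgency cut-off is the lowest value that brings specificity above 0.9. The function names and the example data are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def sens_spec(scores: Sequence[float], outcomes: Sequence[int],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity with 'positive' meaning score >= cutoff."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    sens = sum(1 for s in pos if s >= cutoff) / len(pos)
    spec = sum(1 for s in neg if s < cutoff) / len(neg)
    return sens, spec


def explore_cutoffs(scores: Sequence[float], outcomes: Sequence[int],
                    candidates: Sequence[int] = range(1, 69),
                    target: float = 0.9
                    ) -> Tuple[Optional[int], Optional[int]]:
    """Return (low, high): 'score < low' flags low urgency and
    'score >= high' flags high urgency; None if no candidate qualifies."""
    low: Optional[int] = None
    high: Optional[int] = None
    for c in sorted(candidates):
        sens, spec = sens_spec(scores, outcomes, c)
        if sens > target:
            low = c  # keep the highest cut-off that still qualifies
        if spec > target and high is None:
            high = c  # keep the lowest cut-off that qualifies
    return low, high


# Made-up example data: ten high-urgency and ten other visits.
example_scores = [8, 12, 15, 18, 20, 22, 25, 30, 16, 14,
                  0, 2, 3, 4, 5, 6, 7, 9, 10, 17]
example_outcomes = [1] * 10 + [0] * 10
print(explore_cutoffs(example_scores, example_outcomes))  # (8, 18)
```

In this made-up example, a score below 8 would be treated as low urgency and a score of 18 or higher as high urgency; with real data the proportion of visits falling in each group would also need to be weighed, as in the text.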

Discussion
This observational study shows that performance of the ED-PEWS is variable in four heterogeneous acute care settings in LMICs, ranging from a moderate to good discriminative ability. Despite the variability in performance, the ED-PEWS mostly outperforms existing triage and the Egdell-PAWS. The predefined cut-off points of high and low urgency seem reasonable, but recalibration of the model by selecting the optimal cut-off values can improve performance for different settings. The most important predictors of high urgency in our study population are consciousness and work of breathing.
In the original study on the development of the ED-PEWS, cross-validation of the score in five European emergency departments resulted in an AUC of 0.86 (95%CI 0.84-0.88) for high urgency and 0.67 (95%CI 0.64-0.69) for high and intermediate urgency [6]. Because the initial study used a three-category reference standard, the performance in LMICs cannot be directly compared. What is most remarkable, however, are the large differences in performance between the LMIC settings themselves. Our study has not been designed to further evaluate the reasons behind the differences. However, it can be noted that the two settings that show the best performance, Gambia urban and Suriname, are the settings with the highest level of resources, providing second- and third-line care. It is known that a prediction rule generally performs best in a population most similar to the original validation population [23,24]. The outcome measure for the Suriname setting was also most comparable to the one from the original derivation study. The different inclusion criteria of the different cohorts do not seem to play a role in the differences in performance. Fever was an inclusion criterion for both cohorts from Gambia urban and Tanzania, but performance in these settings was very different.
Although performance of the ED-PEWS is worst in the primary care settings in Tanzania and Gambia rural, its performance is still better than the original local triage system and the Egdell-PAWS for predicting high acuity in children. This might indicate that in these populations, scores based solely on vital parameters are suboptimal to assess children's urgency. This could be due to differences in the prevalence and severity of the presenting conditions. Alternatively, other factors may play a role in determining the urgency of a child, such as body composition or nutritional status [7,23].
The most important predictors from the original study were consciousness and work of breathing, which is comparable to our findings in LMICs. However, although capillary refill time was a significant predictor in the original study, in the current analysis it was only significant in Suriname but not in Gambia rural and contributed little in Gambia urban. This could be due to differences in measurement, for example as a consequence of differences in skin colour or rates of anaemia, or could be related to the variability in underlying conditions.

Strengths and limitations
To the best of our knowledge, the current study is one of the few evaluating a PEWS in LMICs in a multicentre study [25]. We were able to include large datasets from four diverse acute care settings.
Earlier research on the development and use of PEWS has mostly been performed in large tertiary care hospitals in LMICs [26][27][28][29][30]. Our study includes two databases with data from primary care settings, one from a rural area, which is unique for research in LMICs. Other studies on the validation of a PEWS in LMICs mostly focused on a single outcome measure, such as mortality, and used PEWS based on expert opinion. This study focuses on a combined outcome measure to better assess high urgency and uses a PEWS based on statistical analyses. Furthermore, the carefully collected clinical data and the low number of missing values across all settings are a strength of this study.
Our study also has some limitations. There is no gold standard that reflects patient urgency in acute care settings [31]. Therefore, we used the reference standard that was applied in the original ED-PEWS validation and adapted this standard to the different settings [6]. In close collaboration with the local research teams and based on the available data, we developed an outcome measure that was the best possible reflection of patient urgency in these different centres. However, these outcome measures are not completely objective: different healthcare workers may make different decisions on whether to refer, hospitalize or administer parenteral drugs for a child. These differences in outcome measure, together with the level of care and the healthcare system in the country, could explain the difference in the proportion of high-urgency children between the settings. In addition, for some items there is a risk of circularity, because vital signs influence treatment decisions. However, because the ED-PEWS consists of multiple items combined in a composite score, no vital sign alone is able to predict the outcome. Also, the cut-offs for the treatment decisions and the cut-offs used in the ED-PEWS are likely different.
The databases from Gambia urban and Tanzania only included children with fever. Therefore, we could not draw any conclusions about the general ED population in these settings. Also, the Tanzanian database did not include capillary refill time, which is a limitation of these data. Moreover, there was a limited number of outcomes in the cohort from Gambia urban. Finally, due to the heterogeneity of the settings we were unable to pool the results and had to present the results of each setting separately.
Because we used existing databases, we had no direct influence on the accuracy of measurements; however, all of the original studies tried to minimise this bias by ensuring complete and accurate data.
Our study compares performance of the ED-PEWS with the locally used triage systems, amongst others the widely used ETAT, ESI, and IMCI guidelines. This comparison is solely based on the recognition of high-urgency children, and it must be noted that these systems have a much broader use. For example, the WHO ETAT also functions as a mechanism to identify and start treatment of children with life-threatening conditions, which we did not assess in our study.

Future research
The majority of children in the world live in LMICs [32]. The implementation of PEWS in these countries may provide better identification of the sickest children and better allocation of resources, which could decrease mortality and morbidity. In health care settings in LMICs with a higher level of care, adding the ED-PEWS to the first assessment of children appears promising. However, additional external validation of the ED-PEWS is needed, preferably in an implementation study to assess its true clinical value. Implementing a novel PEWS in acute care settings would require training of healthcare workers and guidance on what to do based on a certain score. Furthermore, in clinical settings the ED-PEWS would become isolated from subsequent measures. In implementation it is important to consider whether this early warning score should be added to an existing triage system and which consequences should be attached to the ED-PEWS result. In parallel to the performance of the ED-PEWS, this study also offers methodological insight by showing how data from such different sites can be combined in future prospective studies. The study clearly demonstrates that it is important to take into account the huge variability between settings in LMICs.
In addition, including more centres in the evaluation of the ED-PEWS could provide more insight into the performance of this early warning score in the various acute care settings. In primary care settings, optimising the ED-PEWS is needed to improve discrimination. In addition, further research should focus on novel predictors and other contextual variables influencing the predictive values of the vital parameters in these rural areas, such as access to health care and nutritional status. This further research could give us more insight into the differences in performance between the various settings. Additionally, research on the influence of social and parental determinants on these factors could provide more understanding of the variability between these settings.